A Method for Unsupervised Broad-Coverage Lexical Error Detection and Correction
Abstract
We describe and motivate an unsupervised lexical error detection and correction algorithm and its application in a tool called Lexbar, which appears as a query box on the Web browser toolbar or as a search engine interface. Lexbar accepts as user input candidate strings of English to be checked for acceptability and, where errors are detected, offers corrections. We introduce the notion of hybrid n-grams and extract these from the British National Corpus (BNC) as the knowledge base against which to compare user input. An extended notion of edit distance is used to identify the most likely candidates for correcting detected errors. Results are illustrated with four types of errors.
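The abstract does not define its extended edit distance or the hybrid n-gram format, so the following is only a rough sketch of the general idea of ranking correction candidates by edit distance. It uses the standard token-level Levenshtein distance, not the paper's extended variant, and the names `token_edit_distance` and `rank_candidates`, as well as the sample n-grams, are hypothetical.

```python
from typing import List, Sequence, Tuple


def token_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Standard Levenshtein distance computed over tokens.

    Assumption: this is the textbook distance, used here as a stand-in
    for the paper's extended edit distance, which the abstract does not define.
    """
    # prev[j] holds the distance between the first i-1 tokens of a
    # and the first j tokens of b.
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def rank_candidates(query: str, known_ngrams: List[str]) -> List[Tuple[int, str]]:
    """Sort known n-grams by their token edit distance to the query."""
    q = query.lower().split()
    scored = [(token_edit_distance(q, n.lower().split()), n) for n in known_ngrams]
    return sorted(scored)


if __name__ == "__main__":
    # Hypothetical knowledge base entries, not taken from the BNC.
    knowledge_base = [
        "depend on the weather",
        "depends on the weather forecast",
        "rely on the weather",
    ]
    for distance, ngram in rank_candidates("depend of the weather", knowledge_base):
        print(distance, ngram)
```

In this sketch the candidate with the smallest distance is treated as the most likely correction; a real system would presumably also weight candidates by corpus frequency, which the abstract does not describe.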
Similar resources
Design and implementation of Persian spelling detection and correction system based on Semantic
The Persian language has special features (graphemes, homophones, and multi-shape clinging characters) on electronic devices. Furthermore, designing and implementing NLP tools for Persian is more challenging than for other languages (e.g. English or German). Spelling tools are widely used for editing user texts such as emails and text in editors. Also developing Persian tools will provide Persian progr...
Context-based Speech Recognition Error Detection and Correction
In this paper we present preliminary results of a novel unsupervised approach for high-precision detection and correction of errors in the output of automatic speech recognition (ASR) systems. We model the likely contexts of all words in an ASR system vocabulary by performing a lexical co-occurrence analysis using a large corpus of output from the speech system. We then identify regions in the data th...
Inferring Knowledge with Word Refinements in a Crowdsourced Lexical-Semantic Network
Automatically inferring new relations from already existing ones is a way to improve the quality and coverage of a lexical network and to perform error detection. In this paper, we devise such an approach for the crowdsourced JeuxDeMots lexical network and we focus especially on word refinements. We first present deduction (generic to specific) and induction (specific to generic) which are two ...
An approach to fault detection and correction in design of systems using of Turbo codes
We present an approach to the design of fault-tolerant computing systems. In this paper, a technique is employed that enables the combination of several codes in order to obtain flexibility in the design of error-correcting codes. Code-combining techniques are very effective, and turbo codes are one such code. The Algorithm-based fault tolerance techniques that to detect errors rely on the c...
Combining of Magnitude and Direction of Change Indices to Unsupervised Change Detection in Multitemporal Multispectral Remote Sensing Images
In remote sensing, image-based change detection techniques analyze two images acquired over the same area at different times t1 and t2 to identify the changes that occurred on the Earth's surface. Change detection approaches are mainly categorized as supervised and unsupervised. Generating the change index is a key step for change detection in multi-temporal remote sensing images. Unsupervised chan...
Publication date: 2009